Newton Methods for Fast Solution of Semi-supervised Linear SVMs

Author

  • Vikas Sindhwani

Abstract

In this chapter, we present a family of semi-supervised linear support vector classifiers that are designed to handle partially labeled sparse datasets with a possibly very large number of examples and features. At their core, our algorithms employ recently developed Modified Finite Newton techniques. We provide a fast, multi-switch implementation of linear Transductive SVM (TSVM) that is significantly more efficient and scalable than currently used dual techniques. We present a new Deterministic Annealing (DA) algorithm for optimizing semi-supervised SVMs, which is designed to alleviate local minima problems while also being computationally attractive. We conduct an empirical study on several classification tasks, which confirms the value of our methods in large-scale semi-supervised settings. Our algorithms are implemented in SVMlin, a public domain software package.
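As a rough illustration of the finite Newton idea mentioned in the abstract, the sketch below trains a supervised linear SVM with squared hinge (L2) loss by repeatedly solving a regularized least-squares problem on the currently margin-violating examples. It is a simplified sketch under stated assumptions, not the chapter's algorithm or the SVMlin code: the objective, the names `solve` and `l2svm_newton`, and the parameter `lam` are illustrative choices; the data is dense; and the line search and sparse conjugate-gradient solver that a practical implementation would need are omitted.

```python
# Assumed objective (illustrative, not taken from the chapter):
#   f(w) = lam/2 * ||w||^2 + 1/2 * sum_i max(0, 1 - y_i * w.x_i)^2
# On the active set A = {i : y_i * w.x_i < 1} this is a regularized
# least-squares problem, because (1 - y_i * w.x_i)^2 = (y_i - w.x_i)^2
# when y_i is +1 or -1.


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def l2svm_newton(X, y, lam=1.0, max_iter=50):
    """Simplified Newton iterations for an L2-loss linear SVM (no line search)."""
    d = len(X[0])
    w = [0.0] * d
    active = None
    for _ in range(max_iter):
        margins = [yi * sum(wj * xj for wj, xj in zip(w, xi))
                   for xi, yi in zip(X, y)]
        new_active = [i for i, m in enumerate(margins) if m < 1.0]
        if new_active == active:
            break  # active set is stable, so the gradient at w is zero
        active = new_active
        # Normal equations on the active set:
        #   (lam * I + X_A^T X_A) w = X_A^T y_A
        A = [[(lam if j == k else 0.0)
              + sum(X[i][j] * X[i][k] for i in active)
              for k in range(d)] for j in range(d)]
        b = [sum(X[i][j] * y[i] for i in active) for j in range(d)]
        w = solve(A, b)
    return w


# Tiny hypothetical dataset, used only to exercise the sketch.
X_demo = [[2.0, 0.0], [1.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]]
y_demo = [1.0, 1.0, -1.0, -1.0]
w_demo = l2svm_newton(X_demo, y_demo, lam=1.0)
```

Without a line search, full Newton steps like these can cycle between active sets on harder problems, which is why the loop is capped by `max_iter`.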


Similar resources

Fast semi-supervised SVM classifiers using a priori metric information

This paper describes a support vector machine (SVM)-based parametric optimization method for semi-supervised classification, called LIAM (for LInear hyperplane classifier with A-priori Metric information). Our method takes advantage of similarity information to leverage the unlabeled data in training SVMs. In addition to the smoothness constraints in existing semi-supervised methods, LIAM incor...


A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

This paper develops a fast method for solving linear SVMs with L2 loss function that is suited for large scale data mining tasks such as text classification. This is done by modifying the finite Newton method of Mangasarian in several ways. Experiments indicate that the method is much faster than decomposition methods such as SVMlight, SMO and BSVM (e.g., 4-100 fold), especially when the number...


Making Logistic Regression A Core Data Mining Tool

Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and automatic (i.e. no parameter tuning). Naive Bayes and decision trees are fast and parameter-free, but their accuracy is often below state-of-the-art. Linear support vector machines (SVM) are fast and have good accuracy, but current implementations are sen...


Maximum margin semi-supervised learning with irrelevant data

Semi-supervised learning (SSL) is a typical learning paradigm that trains a model from both labeled and unlabeled data. Traditional SSL models usually assume that the unlabeled data are relevant to the labeled data, i.e., that they follow the same distribution as the targeted labeled data. In this paper, we address a different, yet formidable scenario in semi-supervised classification, where the unlabeled d...


Semi-Supervised Structure Learning

The discriminative learning framework is one of the most successful fields of machine learning. The methods of this paradigm, such as Boosting and Support Vector Machines, have significantly advanced the state of the art for classification by improving the accuracy and by increasing the applicability of machine learning methods. Recently there has been growing interest in generalizing discriminative le...




Publication date: 2006